Amendments to the Claims:

This listing of claims will replace all prior versions, and listings, of claims in the 
application. 

Listing of Claims:

1. (Currently Amended) A method of extracting new word automatically, said 
method comprising the steps of: 

segmenting a cleaned corpus to form a segmented corpus; 

splitting the segmented corpus to form sub strings, and counting the occurrences 
of each sub strings appearing in the corpus; and 

filtering out false candidates to output new words; 

wherein the segmenting and the splitting is not dependent upon word boundaries.

2. (Original) The method of extracting new word automatically according to 
Claim 1, wherein the step of segmenting comprises using punctuations, Arabic digits and 
alphabetic strings, or new words patterns to split the cleaned corpus. 

3. (Original) The method of extracting new word automatically according to 
Claim 1, wherein the step of segmenting comprises using common vocabulary to segment 
the cleaned corpus. 




Atty. Docket No. JP920000191US1 

(590.079) 

4. (Original) The method of extracting new word automatically according to 
Claim 1, wherein the step of splitting and counting is implemented using a GAST. 

5. (Original) The method of extracting new word automatically according to 
Claim 4, wherein a GAST is implemented by limiting length of sub strings. 

6. (Original) The method of extracting new word automatically according to 
Claim 1, wherein the step of filtering out false candidates comprises: 

filtering out functional words; 

filtering out those sub strings which almost always appear along with a longer sub 
strings; and 

filtering out those sub strings for which the occurrence is less than a 
predetermined threshold. 

7. (Original) The method of extracting new word automatically according to 
Claim 1, wherein the step of segmenting the cleaned corpus comprises using pre- 
recognized functional words as segment boundary patterns. 

8. (Original) The method of extracting new word automatically according to 
Claim 3, wherein the step of segmenting cleaned corpus comprises using pre-recognized 
functional words as segment boundary patterns. 

9. (Original) The method of extracting new word automatically according to 
Claim 3, wherein the step of filtering out false words comprises: 

filtering out functional words; 

filtering out those sub strings which almost always appear along with a longer sub 
strings; and 

filtering out those sub strings for which the occurrence is less than a 
predetermined threshold. 

10. (Currently Amended) An automatic new word extraction system, 
comprising: 

a segmentor which segments a cleaned corpus to form a segmented corpus; 

a splitter which splits the segmented corpus to form sub strings, and which counts 
the number of the sub strings appearing in the corpus; and 

a filter which filters out false candidates to output new words; 

wherein the segmentor and the splitter is not dependent upon word boundaries.

11. (Original) The automatic word extraction system according to Claim 10, 
wherein the segmentor uses punctuations, Arabic digits and alphabetic strings, or new 
word pattern to segment the cleaned corpus. 

12. (Original) The automatic word extraction system according to Claim 10, 
wherein the segmentor uses common vocabulary to segment the cleaned corpus. 

13. (Original) The automatic word extraction system according to Claim 10, 
wherein the splitter builds a GAST.

14. (Original) The automatic word extraction system according to Claim 13, 
wherein the GAST limits the length of sub strings. 

15. (Original) The automatic word extraction system according to Claim 10, 
wherein the filter filters out functional words; those sub strings which almost always 
appear along with longer sub strings; and those sub strings for which the occurrence is 
less than a predetermined threshold. 

16. (Original) The automatic word extraction system according to Claim 10, 
wherein the segmentor uses pre-recognized functional words as segment boundary 
patterns. 

17. (Original) The automatic word extraction system according to Claim 12, 
wherein the segmentor uses pre-recognized functional words as segment boundary 
patterns.

18. (Original) The automatic word extraction system according to Claim 12, 
wherein the filter filters out functional words; those sub strings which almost always 
appear along with a longer sub strings; and those sub strings for which the occurrence is 
less than a predetermined threshold. 

19. (Currently Amended) A program storage device readable by machine, 
tangibly embodying a program of instructions executable by the machine to perform 
method steps for extracting new word automatically, said method comprising the steps 
of: 

segmenting a cleaned corpus to form a segmented corpus; 

splitting the segmented corpus to form sub strings, and counting the occurrences 
of each sub strings appearing in the corpus; and 

filtering out false candidates to output new words; 

wherein the segmenting and the splitting is not dependent upon word boundaries.
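
For illustration only, and forming no part of the claims, the steps recited above (segmenting, splitting and counting, filtering) might be sketched in Python as follows. Everything named here is an assumption made for the sketch: the functional word list, the length cap, the occurrence threshold, the 0.9 ratio standing in for "almost always", and the rule that drops single-character sub strings. A plain counter stands in for the GAST of Claims 4 and 13; it is not a suffix tree.

```python
import re
from collections import Counter

# Hypothetical values; the claims do not specify any of them.
FUNCTIONAL_WORDS = {"的", "了", "是"}  # pre-recognized functional words
MAX_LEN = 4          # cap on sub string length (cf. Claims 5 and 14)
MIN_COUNT = 2        # the "predetermined threshold"
SUBSUME_RATIO = 0.9  # stand-in for "almost always"


def segment(corpus):
    """Cut the corpus at punctuation, digits, ASCII alphabetic runs and
    functional words, without consulting any word-boundary information."""
    parts = [r"[\W\d_A-Za-z]+"] + [re.escape(w) for w in sorted(FUNCTIONAL_WORDS)]
    return [s for s in re.split("|".join(parts), corpus) if s]


def count_substrings(segments, max_len=MAX_LEN):
    """Count every sub string of length 1..max_len in each segment.
    A Counter replaces the GAST here purely for brevity."""
    counts = Counter()
    for seg in segments:
        for i in range(len(seg)):
            for j in range(i + 1, min(i + max_len, len(seg)) + 1):
                counts[seg[i:j]] += 1
    return counts


def filter_candidates(counts):
    """Apply the three filters of Claim 6, plus an assumed rule that
    single-character sub strings are not reported."""
    result = {}
    for s, c in counts.items():
        if len(s) < 2 or s in FUNCTIONAL_WORDS:
            continue
        if c < MIN_COUNT:
            continue
        # Drop s when a one-character-longer sub string containing it
        # accounts for nearly all of its occurrences.
        if any(len(t) == len(s) + 1 and s in t and n >= SUBSUME_RATIO * c
               for t, n in counts.items()):
            continue
        result[s] = c
    return result


def extract_new_words(corpus):
    return filter_candidates(count_substrings(segment(corpus)))
```

Under these assumptions, calling `extract_new_words` on a corpus in which one multi-character string recurs keeps that string and discards its shorter fragments, since each fragment occurs only inside the longer string.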